team play
From motor control to embodied intelligence
Using human and animal motions, we teach robots to dribble a ball and simulated humanoid characters to carry boxes and play football. Five years ago, we took on the challenge of teaching a fully articulated humanoid character to traverse obstacle courses. Here, we describe a solution to these challenges called neural probabilistic motor primitives (NPMP), which involves guided learning from movement patterns derived from humans and animals, and discuss how this approach is used in our Humanoid Football paper, published today in Science Robotics. We also discuss how the same approach enables humanoid full-body manipulation from vision, such as a humanoid carrying an object, and robotic control in the real world, such as a robot dribbling a ball. An NPMP is a general-purpose motor control module that translates short-horizon motor intentions into low-level control signals. It is trained offline or via RL by imitating motion capture (MoCap) data, recorded with trackers on humans or animals performing motions of interest.
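The NPMP described above can be pictured as a small encoder-decoder pair: an encoder compresses a short snippet of reference motion into a latent motor intention z, and a decoder maps z plus proprioception to low-level joint commands, after which the decoder is reused as the low-level controller. The following is a minimal sketch of that structure, not the published architecture; all module names and dimensions are illustrative assumptions.

```python
# A minimal sketch of an NPMP-style motor module (illustrative, not the authors' code).
import torch
import torch.nn as nn

LATENT_DIM = 60        # size of the "motor intention" z (assumed)
PROPRIO_DIM = 56       # proprioceptive observation size (assumed)
ACTION_DIM = 21        # low-level joint-actuation dimension (assumed)
FUTURE_STEPS = 5       # short horizon of reference motion seen by the encoder

class IntentionEncoder(nn.Module):
    """Compresses a short snippet of future reference motion into a latent z."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(FUTURE_STEPS * PROPRIO_DIM, 256), nn.ELU(),
            nn.Linear(256, 2 * LATENT_DIM),   # mean and log-std of z
        )

    def forward(self, future_reference):
        mean, log_std = self.net(future_reference).chunk(2, dim=-1)
        return mean + torch.randn_like(mean) * log_std.exp()   # reparameterised sample

class MotorDecoder(nn.Module):
    """The reusable motor primitive: (z, proprioception) -> low-level action."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM + PROPRIO_DIM, 512), nn.ELU(),
            nn.Linear(512, 512), nn.ELU(),
            nn.Linear(512, ACTION_DIM), nn.Tanh(),
        )

    def forward(self, z, proprio):
        return self.net(torch.cat([z, proprio], dim=-1))

# After training (e.g. by imitating MoCap-tracking experts), the decoder can be
# frozen and a task policy controls the body by emitting z instead of raw torques.
encoder, decoder = IntentionEncoder(), MotorDecoder()
proprio = torch.randn(1, PROPRIO_DIM)
z = encoder(torch.randn(1, FUTURE_STEPS * PROPRIO_DIM))
action = decoder(z, proprio)          # low-level control signal
```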
From Motor Control to Team Play in Simulated Humanoid Football
Siqi Liu, Guy Lever, Zhe Wang, Josh Merel, S. M. Ali Eslami, Daniel Hennes, Wojciech M. Czarnecki, Yuval Tassa, Shayegan Omidshafiei, Abbas Abdolmaleki, Noah Y. Siegel, Leonard Hasenclever, Luke Marris, Saran Tunyasuvunakool, H. Francis Song, Markus Wulfmeier, Paul Muller, Tuomas Haarnoja, Brendan D. Tracey, Karl Tuyls, Thore Graepel, Nicolas Heess
Intelligent behaviour in the physical world exhibits structure at multiple spatial and temporal scales. Although movements are ultimately executed at the level of instantaneous muscle tensions or joint torques, they must be selected to serve goals defined on much longer timescales, and in terms of relations that extend far beyond the body itself, ultimately involving coordination with other agents. Recent research in artificial intelligence has shown the promise of learning-based approaches to the respective problems of complex movement, longer-term planning and multi-agent coordination. However, there is limited research aimed at their integration. We study this problem by training teams of physically simulated humanoid avatars to play football in a realistic virtual environment. We develop a method that combines imitation learning, single- and multi-agent reinforcement learning and population-based training, and makes use of transferable representations of behaviour for decision making at different levels of abstraction. In a sequence of stages, players first learn to control a fully articulated body to perform realistic, human-like movements such as running and turning; they then acquire mid-level football skills such as dribbling and shooting; finally, they develop awareness of others and play as a team, bridging the gap between low-level motor control at a timescale of milliseconds, and coordinated goal-directed behaviour as a team at the timescale of tens of seconds. We investigate the emergence of behaviours at different levels of abstraction, as well as the representations that underlie these behaviours using several analysis techniques, including statistics from real-world sports analytics. Our work constitutes a complete demonstration of integrated decision-making at multiple scales in a physically embodied multi-agent setting. See project video at https://youtu.be/KHMwq9pv7mg.
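The staged hierarchy described in this abstract separates timescales: a high-level football policy, trained with single- and multi-agent RL, issues motor intentions at a coarse rate, while a frozen, pre-trained low-level motor module converts them into joint-level actions at every control step. The sketch below illustrates only that separation; it is an assumption-laden toy, not the paper's implementation, and all names, dimensions, and the update period are illustrative.

```python
# A minimal sketch (assumed, not the paper's code) of the two-timescale hierarchy:
# a high-level policy emits motor intentions z at a coarse timescale, and a frozen
# low-level motor module turns them into joint-level actions at the fast timescale.
import torch
import torch.nn as nn

TASK_OBS_DIM = 100     # egocentric ball/goal/teammate features (illustrative)
PROPRIO_DIM = 56
LATENT_DIM = 60
ACTION_DIM = 21
HIGH_LEVEL_PERIOD = 3  # low-level steps per high-level decision (illustrative)

class HighLevelPolicy(nn.Module):
    """Trained with (multi-agent) RL; outputs a motor intention, not torques."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(TASK_OBS_DIM + PROPRIO_DIM, 256), nn.ELU(),
            nn.Linear(256, LATENT_DIM),
        )

    def forward(self, task_obs, proprio):
        return self.net(torch.cat([task_obs, proprio], dim=-1))

low_level = nn.Sequential(            # stands in for the frozen, pre-trained motor module
    nn.Linear(LATENT_DIM + PROPRIO_DIM, 512), nn.ELU(),
    nn.Linear(512, ACTION_DIM), nn.Tanh(),
)
for p in low_level.parameters():
    p.requires_grad_(False)           # stage-1 movement skills are reused, not retrained

policy = HighLevelPolicy()
task_obs, proprio = torch.randn(1, TASK_OBS_DIM), torch.randn(1, PROPRIO_DIM)
z = policy(task_obs, proprio)         # refreshed every HIGH_LEVEL_PERIOD steps
for _ in range(HIGH_LEVEL_PERIOD):    # low-level control runs at the fast timescale
    action = low_level(torch.cat([z, proprio], dim=-1))
```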
- Leisure & Entertainment > Sports > Soccer (1.00)
- Leisure & Entertainment > Sports > Football (1.00)
- Leisure & Entertainment > Games (1.00)
- Education (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Multi-Agent Collaboration via Reward Attribution Decomposition
Tianjun Zhang, Huazhe Xu, Xiaolong Wang, Yi Wu, Kurt Keutzer, Joseph E. Gonzalez, Yuandong Tian
Recent advances in multi-agent reinforcement learning (MARL) have achieved superhuman performance in games like Quake 3 and Dota 2. Unfortunately, these techniques require orders-of-magnitude more training rounds than humans and may not generalize to slightly altered environments or new agent configurations (i.e., ad hoc team play). In this work, we propose Collaborative Q-learning (CollaQ) that achieves state-of-the-art performance in the StarCraft multi-agent challenge and supports ad hoc team play. We first formulate multi-agent collaboration as a joint optimization on reward assignment and show that under certain conditions, each agent has a decentralized Q-function that is approximately optimal and can be decomposed into two terms: the self-term that only relies on the agent's own state, and the interactive term that is related to states of nearby agents, often observed by the current agent. The two terms are jointly trained using regular DQN, regulated with a Multi-Agent Reward Attribution (MARA) loss that ensures both terms retain their semantics. CollaQ is evaluated on various StarCraft maps, outperforming existing state-of-the-art techniques (i.e., QMIX, QTRAN, and VDN) by improving the win rate by 40% with the same number of environment steps. In the more challenging ad hoc team play setting (i.e., reweight/add/remove units without retraining or finetuning), CollaQ outperforms previous SoTA by over 30%.

In recent years, multi-agent deep reinforcement learning (MARL) has drawn increasing interest from the research community. MARL algorithms have shown superhuman level performance in various games like Dota 2 (Berner et al., 2019), Quake 3 Arena (Jaderberg et al., 2019), and StarCraft (Samvelyan et al., 2019). However, the algorithms (Schulman et al., 2017; Mnih et al., 2013) are far less sample efficient than humans.
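The decomposition CollaQ describes is concrete enough to sketch: each agent's Q-function is the sum of a self term over its own observation and an interactive term over nearby agents, trained with an ordinary DQN TD loss plus a MARA-style penalty that pushes the interactive term toward zero when no other agents are observed. The code below is a hedged toy version of that idea, not the authors' released implementation; module names, observation sizes, and the exact form of the penalty are assumptions.

```python
# A minimal sketch of a CollaQ-style Q-value decomposition (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

SELF_OBS_DIM = 30      # the agent's own state features (assumed)
FULL_OBS_DIM = 90      # own state plus nearby agents' states (assumed)
NUM_ACTIONS = 10

class CollaQAgent(nn.Module):
    def __init__(self):
        super().__init__()
        # Self term: depends only on the agent's own observation.
        self.q_self = nn.Sequential(
            nn.Linear(SELF_OBS_DIM, 128), nn.ReLU(), nn.Linear(128, NUM_ACTIONS))
        # Interactive term: also sees nearby agents' states.
        self.q_interact = nn.Sequential(
            nn.Linear(FULL_OBS_DIM, 128), nn.ReLU(), nn.Linear(128, NUM_ACTIONS))

    def forward(self, self_obs, full_obs):
        return self.q_self(self_obs) + self.q_interact(full_obs)

def collaq_loss(agent, self_obs, full_obs, alone_obs, action, td_target, alpha=1.0):
    """DQN TD loss plus a MARA-style penalty that keeps the interactive term near
    zero on an "alone" observation (others masked out), so both terms keep their
    intended semantics."""
    q = agent(self_obs, full_obs).gather(1, action.unsqueeze(1)).squeeze(1)
    td_loss = F.mse_loss(q, td_target)
    mara_loss = agent.q_interact(alone_obs).pow(2).mean()
    return td_loss + alpha * mara_loss

# Usage with dummy batch data (shapes are illustrative).
agent = CollaQAgent()
batch = 4
loss = collaq_loss(
    agent,
    torch.randn(batch, SELF_OBS_DIM),
    torch.randn(batch, FULL_OBS_DIM),
    torch.randn(batch, FULL_OBS_DIM),      # "alone" observation with others masked out
    torch.randint(0, NUM_ACTIONS, (batch,)),
    torch.randn(batch),
)
loss.backward()
```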
Of course 'Battlefield V' is getting a battle royale mode
EA started off its EA Play show at E3 by answering a question on everyone's lips: Will Battlefield V follow the shooter vogue and have its own battle royale mode? While the game's leads offered no details when introducing the feature, the franchise's Twitter account confirmed it will have team play and vehicles: "Battle royale is coming to #Battlefield V, reimagined with the core pillars of destruction, team play, and vehicles. It will be unlike anything you've played before, and we'll have more to talk about later this year." Time will tell if the game puts its own spin on the mode or is just trying to cash in on the zeitgeist, but at least it won't have loot boxes.
- Information Technology > Communications > Social Media (0.73)
- Information Technology > Artificial Intelligence > Games > Computer Games (0.67)